N=1:

rank | frequency | n-gram |
---|---|---|
1 | 6089 | -n |
2 | 4547 | -e |
3 | 2697 | -t |
4 | 2556 | -s |
5 | 2321 | -g |

N=2:

rank | frequency | n-gram |
---|---|---|
1 | 5253 | -en |
2 | 1719 | -ng |
3 | 1096 | -de |
4 | 1030 | -er |
5 | 747 | -ie |

N=3:

rank | frequency | n-gram |
---|---|---|
1 | 1632 | -ing |
2 | 922 | -ten |
3 | 802 | -gen |
4 | 714 | -den |
5 | 515 | -ren |

N=4:

rank | frequency | n-gram |
---|---|---|
1 | 487 | -ngen |
2 | 344 | -ende |
3 | 331 | -eren |
4 | 272 | -ring |
5 | 254 | -atie |

N=5:

rank | frequency | n-gram |
---|---|---|
1 | 431 | -ingen |
2 | 203 | -ering |
3 | 154 | -elijk |
4 | 153 | -lijke |
5 | 150 | -ische |

The tables show the most frequent letter n-grams at the ends of words for N=1…5. The procedure parallels that of 2.2.5 Most frequent word beginnings, except that the aim here is to detect suffixes rather than prefixes.
The query for N=3:
SELECT @pos:=(@pos+1) AS pos, xx.* FROM (SELECT @pos:=0) r, (SELECT COUNT(*) AS cnt, CONCAT('-', RIGHT(word,3)) AS ngram FROM words WHERE w_id>100 GROUP BY RIGHT(word,3) ORDER BY cnt DESC) xx LIMIT 5;
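
For readers without access to the database, the same counting can be sketched in plain Python. This is only an illustration, not the code used to produce the tables: the function name `top_endings` and the sample word list are hypothetical, and the input is assumed to be a list of distinct word forms, one per row of the `words` table. The `w_id>100` filter from the query is not reproduced here.

```python
from collections import Counter


def top_endings(words, n, k=5):
    """Return the k most frequent word-final letter n-grams.

    Each result is a (rank, frequency, n-gram) tuple, mirroring the
    table columns. Like RIGHT(word, n) in the query, a word shorter
    than n characters contributes the whole word.
    """
    # w[-n:] takes the last n characters (or the whole word if shorter).
    counts = Counter(w[-n:] for w in words)
    # most_common sorts by descending count; ties keep first-seen order.
    return [
        (rank, cnt, "-" + gram)
        for rank, (gram, cnt) in enumerate(counts.most_common(k), start=1)
    ]


# Hypothetical sample data, not taken from the corpus.
sample = ["lopen", "kopen", "woning", "regering", "mogelijk", "rekening", "maken"]
print(top_endings(sample, 3, k=2))  # [(1, 3, '-ing'), (2, 2, '-pen')]
```

Changing `n` from 1 to 5 would give the analogue of each of the five tables for whatever word list is supplied.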